 standard gaussian


On efficient robust regression with subquadratic samples

arXiv.org Machine Learning

We revisit the problem of robust linear regression under Gaussian covariates with an unknown covariance matrix of condition number $κ$. For this fundamental problem, significant gaps remain in our understanding of the trade-offs among sample complexity, condition number, runtime, and prediction error for efficient algorithms. Our first result is a near-linear-time algorithm that uses $\widetilde{O}(d/ε^4)$ samples, where $d$ is the dimension and $ε$ is the corruption rate, and achieves prediction error $O(\sqrt{εκ})$ under the condition $εκ \lesssim 1$, improving over all prior works. We complement this result with a Statistical Query (SQ) lower bound showing that efficient SQ algorithms achieving error $o(\sqrt{εκ})$ when $εκ \lesssim 1$ require queries that take $Ω(d^2)$ samples to simulate. Finally, we prove a low-degree polynomial lower bound that gives fine-grained evidence that, without assumptions such as $εκ \lesssim 1$, efficient algorithms may require $\widetilde{Ω}\left(\min\{dε^{2}κ^{2},\ ε^{2}d^{2}\}\right)$ samples to significantly outperform the trivial estimator that always guesses $0$.
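To make the setting in this abstract concrete, the following is a minimal sketch of the data model and of the trivial estimator it mentions. It is not the paper's algorithm. The diagonal covariance, the label-only corruption with a fixed value, the noise level, and the definition of prediction error as $\sqrt{(\hat β - β)^\top Σ (\hat β - β)}$ are all assumptions made for illustration, as are the names `make_corrupted_data` and `prediction_error`.

```python
import math
import random


def make_corrupted_data(n, d, kappa, eps, beta, rng):
    """Gaussian covariates with a diagonal covariance of condition number kappa.

    An eps-fraction of the labels is then overwritten with a crude outlier
    value (an assumed stand-in; a real adversary may alter both x and y).
    Requires d >= 2.
    """
    # Eigenvalues spaced geometrically from 1 to kappa.
    eigs = [kappa ** (i / (d - 1)) for i in range(d)]
    xs, ys = [], []
    for _ in range(n):
        x = [math.sqrt(s) * rng.gauss(0.0, 1.0) for s in eigs]
        y = sum(b * xi for b, xi in zip(beta, x)) + rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(y)
    for i in rng.sample(range(n), int(eps * n)):
        ys[i] = 100.0  # assumed corruption value, for illustration only
    return eigs, xs, ys


def prediction_error(beta_hat, beta, eigs):
    """sqrt((beta_hat - beta)^T Sigma (beta_hat - beta)) for diagonal Sigma (assumed definition)."""
    return math.sqrt(sum(s * (bh - b) ** 2 for s, bh, b in zip(eigs, beta_hat, beta)))


rng = random.Random(0)
n, d, kappa, eps = 200, 4, 4.0, 0.1
beta = [1.0, 0.0, 0.0, 0.0]
eigs, xs, ys = make_corrupted_data(n, d, kappa, eps, beta, rng)

# The trivial estimator always guesses 0; its error is the baseline to beat.
trivial_error = prediction_error([0.0] * d, beta, eigs)
```

Any robust estimator can be scored with `prediction_error` against the same `beta` and compared with `trivial_error`.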


Statistical Query Lower Bounds for List-Decodable Linear Regression

Neural Information Processing Systems

We study the problem of list-decodable linear regression, where an adversary can corrupt a majority of the examples. Specifically, we are given a set $T$ of labeled examples $(x, y) \in \mathbb{R}^d \times \mathbb{R}$ and a parameter $0 < α < 1/2$ such that an $α$-fraction of the points in $T$ are i.i.d.



Supplementary Material Proofs from Section 2

Neural Information Processing Systems

The proof of Claim 2.3 is obtained via the following calculation, using the definition of the Hermite tensor (Definition 2.2). We will use $i, j$ for indices in $[d]$. The above is equivalent to $H_k(Bx) = B^{\otimes k} H_k(x)$. We construct the truncated distribution A as follows. We first sample $x \sim A$, then we reject $x$ unless x 2 B. Let A be the distribution of the samples we get from this process. Using Markov's inequality and union bound, we have Then it only remains to verify Ex A [Hk(x)] Ex Nm[Hk(x)] 2 for any $k < d$.





Estimating the intrinsic dimensionality using Normalizing Flows - Supplementary

Neural Information Processing Systems

With these conditions, a direct consequence is that the singular values in on-manifold directions will not depend on $σ^2$. Hence, if we fix the latent distribution to be standard Gaussian, we have that the NF used to learn $q_{σ^2}$ must be f for all $(u, v)$, i.e. However, these eigenvalues are exactly in direction of large variability, i.e. in on-manifold direction. This was to be shown. Let us assume that $σ_1^2 = \dots = σ_d^2$ in the following. B.1 Lollipop In [11], a manifold consisting of regions of different ID was considered: a 1-dimensional line segment and a two-dimensional disk, such that the overall manifold resembles a lollipop.


SQ Lower Bounds for Non-Gaussian Component Analysis with Weaker Assumptions

Neural Information Processing Systems

We study the complexity of Non-Gaussian Component Analysis (NGCA) in the Statistical Query (SQ) model. Prior work developed a general methodology to prove SQ lower bounds for this task that have been applicable to a wide range of contexts. In particular, it was known that for any univariate distribution A satisfying certain conditions, distinguishing between a standard multivariate Gaussian and a distribution that behaves like A in a random hidden direction and like a standard Gaussian in the orthogonal complement, is SQ-hard. The required conditions were that (1) A matches many low-order moments with the standard univariate Gaussian, and (2) the chi-squared norm of A with respect to the standard Gaussian is finite. While the moment-matching condition is necessary for hardness, the chi-squared condition was only required for technical reasons. In this work, we establish that the latter condition is indeed not necessary. In particular, we prove near-optimal SQ lower bounds for NGCA under the moment-matching condition only. Our result naturally generalizes to the setting of a hidden subspace. Leveraging our general SQ lower bound, we obtain near-optimal SQ lower bounds for a range of concrete estimation tasks where existing techniques provide sub-optimal or even vacuous guarantees.
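The hidden-direction distribution described in this abstract can be sketched as a sampler: each point behaves like a univariate distribution $A$ along a unit vector $v$ and like a standard Gaussian in the orthogonal complement. This is only a toy illustration, not a construction from the paper. The choice of a Rademacher variable for $A$ (which matches the standard Gaussian's first three moments but not the fourth) and the names `sample_ngca` and `rademacher` are assumptions.

```python
import math
import random


def random_unit_vector(d, rng):
    """Uniformly random direction, obtained by normalizing a Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in g))
    return [c / norm for c in g]


def sample_ngca(d, v, sample_a, rng):
    """One point: distributed like A along v, standard Gaussian orthogonal to v."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    g_dot_v = sum(gi * vi for gi, vi in zip(g, v))
    a = sample_a(rng)
    # Remove the Gaussian component along v and put a draw from A in its place.
    return [gi + (a - g_dot_v) * vi for gi, vi in zip(g, v)]


def rademacher(rng):
    # Assumed toy choice of A: moments 0, 1, 0 match N(0, 1), but E[a^4] = 1, not 3.
    return rng.choice((-1.0, 1.0))


rng = random.Random(0)
d = 5
v = random_unit_vector(d, rng)
points = [sample_ngca(d, v, rademacher, rng) for _ in range(2000)]

# Projecting onto the hidden direction recovers the draws from A.
projections = [sum(xi * vi for xi, vi in zip(x, v)) for x in points]
```

Because this $A$ is discrete, its chi-squared norm with respect to the standard Gaussian is not finite, so it falls outside condition (2) while still matching a few low-order moments.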